Abstract

Orthogonal Matching Pursuit (OMP) is the canonical greedy algorithm for sparse approximation. In this paper we demonstrate that the restricted isometry property (RIP) can be used for a very straightforward analysis of OMP. Our main conclusion is that the RIP of order $K+1$ (with isometry constant $\delta < \frac{1}{3\sqrt{K}}$) is sufficient for OMP to exactly recover any $K$-sparse signal. Our analysis relies on simple and intuitive observations about OMP and matrices which satisfy the RIP. For restricted classes of $K$-sparse signals (those that are highly compressible), a relaxed bound on the isometry constant is also established. A deeper understanding of OMP may benefit the analysis of greedy algorithms in general. To demonstrate this, we also briefly revisit the analysis of the Regularized OMP (ROMP) algorithm.


1  Introduction 

1.1  Orthogonal  Matching  Pursuit 

Orthogonal Matching Pursuit (OMP) is the canonical greedy algorithm for sparse approximation. Letting $\Phi$ denote a matrix of size $M \times N$ (where typically $M < N$) and $y$ denote a vector in $\mathbb{R}^M$, the goal of OMP is to recover a coefficient vector $x \in \mathbb{R}^N$ with roughly $K < M$ nonzero terms so that $\Phi x$ equals $y$ exactly or approximately. OMP is frequently used to find sparse representations for signals $y \in \mathbb{R}^M$ in settings where $\Phi$ represents an overcomplete dictionary for the signal space [1-3]. It is also commonly used in compressive sensing (CS), where $y = \Phi x$ represents compressive measurements of a sparse or nearly-sparse signal $x \in \mathbb{R}^N$ to be recovered [4-6].

One  of  the  attractive  features  of  OMP  is  its  simplicity.  The  entire  algorithm  is  specified  in 
Algorithm  1,  and  it  requires  approximately  the  same  number  of  lines  of  code  to  implement  in  a 
software  package  such  as  Matlab.  Despite  its  simplicity,  OMP  is  empirically  competitive  in  terms 
of  approximation  performance  [3,7]. 

Theoretical analysis of OMP to date has concentrated primarily on two fronts. The first has involved the notion of a coherence parameter $\mu := \max_{i \neq j} |\langle \phi_i, \phi_j \rangle|$, where $\phi_i$ denotes column $i$ of the matrix $\Phi$. When the columns of $\Phi$ have unit norm and $\mu < \frac{1}{2K-1}$, it has been shown [3] that OMP will recover any $K$-sparse signal $x$ from the measurements $y = \Phi x$. This guarantee is deterministic and applies to any matrix $\Phi$ having normalized columns and $\mu < \frac{1}{2K-1}$.
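As a rough illustration of the coherence condition, the following sketch computes $\mu$ for a matrix and the largest $K$ for which $\mu < 1/(2K-1)$ holds. The function names `coherence` and `max_sparsity_from_coherence` are hypothetical, the columns are normalized inside the function as an assumption, and the Gaussian test matrix is only an example.

```python
import numpy as np

def coherence(Phi):
    """Largest |<phi_i, phi_j>| over distinct columns, after normalizing columns."""
    cols = Phi / np.linalg.norm(Phi, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)          # ignore the i == j terms
    return gram.max()

def max_sparsity_from_coherence(mu):
    """Largest integer K with mu < 1/(2K - 1), i.e. K < (1 + 1/mu)/2 (assumes mu > 0)."""
    bound = 0.5 * (1.0 + 1.0 / mu)
    return int(np.ceil(bound)) - 1       # largest integer strictly below `bound`

rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))     # example matrix, not from the paper
mu = coherence(Phi)
K_max = max_sparsity_from_coherence(mu)
print(mu, K_max)
```

For random matrices of moderate size the resulting `K_max` tends to be small, which is in line with the quadratic measurement scaling discussed for coherence-based guarantees.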

Algorithm 1 Orthogonal Matching Pursuit
input: $\Phi$, $y$, stopping criterion
initialize: $r^0 = y$, $x^0 = 0$, $\Lambda^0 = \emptyset$, $\ell = 0$
while not converged do
    match: $h^\ell = \Phi^T r^\ell$
    identify: $\Lambda^{\ell+1} = \Lambda^\ell \cup \{\arg\max_j |h^\ell(j)|\}$ (if multiple maxima exist, choose only one)
    update: $x^{\ell+1} = \arg\min_{z:\, \mathrm{supp}(z) \subseteq \Lambda^{\ell+1}} \|y - \Phi z\|_2$
            $r^{\ell+1} = y - \Phi x^{\ell+1}$
            $\ell = \ell + 1$
end while
output: $\hat{x} = x^\ell = \arg\min_{z:\, \mathrm{supp}(z) \subseteq \Lambda^\ell} \|y - \Phi z\|_2$
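The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `omp`, the stopping rule (`max_iter` iterations or a residual norm below `tol`), and the random test problem are all assumptions made here.

```python
import numpy as np

def omp(Phi, y, max_iter, tol=1e-10):
    """Sketch of Algorithm 1; `max_iter` and `tol` form an assumed stopping criterion."""
    N = Phi.shape[1]
    support = []                             # Lambda^l, in order of selection
    x_hat = np.zeros(N)                      # x^0 = 0
    r = np.asarray(y, dtype=float).copy()    # r^0 = y
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        h = Phi.T @ r                        # match
        j = int(np.argmax(np.abs(h)))        # identify: argmax keeps a single index
        if j in support:                     # numerical safeguard, not part of Algorithm 1
            break
        support.append(j)
        # update: least squares restricted to the chosen columns
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        x_hat = np.zeros(N)
        x_hat[support] = coef
        r = y - Phi @ x_hat
    return x_hat, support

# Example problem (assumed sizes): a 3-sparse signal and a Gaussian matrix.
rng = np.random.default_rng(0)
M, N, K = 128, 256, 3
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.choice([-1.0, 1.0], size=K) * (1.0 + rng.random(K))
y = Phi @ x
x_hat, support = omp(Phi, y, max_iter=K)
print(np.linalg.norm(x - x_hat), sorted(support))
```

The least-squares solve is repeated from scratch in each iteration for clarity; practical implementations usually update a factorization incrementally instead.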


The second analytical front has involved the notion of probability. Suppose $x \in \mathbb{R}^N$ with $\|x\|_0 := |\mathrm{supp}(x)| \le K$ and that $\Phi$ is drawn from a suitable random distribution (independently of $x$) with $M = O(K \log(N))$ rows. Then with high probability, OMP will recover $x$ exactly from the measurements $y = \Phi x$ [6]. It is not guaranteed, however, that any such fixed matrix will allow recovery of all sparse $x$ simultaneously.


1.2  The  Restricted  Isometry  Property 


As an alternative to coherence and to probabilistic analysis, a large number of algorithms within the broader field of CS have been studied using the restricted isometry property (RIP) for the matrix $\Phi$ [8]. A matrix $\Phi$ satisfies the RIP of order $K$ if there exists a constant $\delta \in (0,1)$ such that

$$(1-\delta)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1+\delta)\|x\|_2^2 \qquad (1)$$

holds for all $x$ such that $\|x\|_0 \le K$. In other words, $\Phi$ acts as an approximate isometry on the set of vectors that are $K$-sparse. Much is known about finding matrices that satisfy the RIP. For example, if we draw a random $M \times N$ matrix $\Phi$ whose entries $\phi_{ij}$ are independent and identically distributed sub-Gaussian random variables, then provided that

$$M = O\left(K \log(N/K)/\delta^2\right), \qquad (2)$$

with high probability $\Phi$ will satisfy the RIP of order $K$ [9,10].
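For very small problems the isometry constant in (1) can be computed exhaustively, which may help build intuition. The sketch below is an assumption-laden illustration: `rip_constant` is a hypothetical helper that enumerates every support of size $K$ (so its cost grows combinatorially with $N$), and the matrix sizes are arbitrary.

```python
import itertools
import numpy as np

def rip_constant(Phi, K):
    """Smallest delta for which (1) holds for all K-sparse x, by brute force.

    For each support S of size K, the extreme eigenvalues of Phi_S^T Phi_S bound
    ||Phi x||^2 / ||x||^2 over vectors supported on S. Supports of size exactly K
    suffice, because smaller supports are contained in them.
    """
    delta = 0.0
    for S in itertools.combinations(range(Phi.shape[1]), K):
        cols = Phi[:, list(S)]
        eig = np.linalg.eigvalsh(cols.T @ cols)   # ascending order
        delta = max(delta, 1.0 - eig[0], eig[-1] - 1.0)
    return delta

rng = np.random.default_rng(0)
Phi = rng.standard_normal((8, 12)) / np.sqrt(8)   # example sizes, chosen to keep the search cheap
delta_1 = rip_constant(Phi, 1)
delta_2 = rip_constant(Phi, 2)
print(delta_1, delta_2)
```

A returned value of 1 or more means the matrix does not satisfy the RIP of that order in the sense of (1), since the definition requires $\delta \in (0,1)$.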

When it is satisfied, the RIP for a matrix $\Phi$ provides a sufficient condition to guarantee successful sparse recovery using a wide variety of algorithms [8,11-19]. As an example, the RIP of order $2K$ (with isometry constant $\delta < \sqrt{2} - 1$) is a sufficient condition to permit $\ell_1$-minimization (the canonical convex optimization problem for sparse approximation) to exactly recover any $K$-sparse signal and to approximately recover those that are nearly sparse [11]. The same RIP assumption is also a sufficient condition for robust recovery in noise using a modified $\ell_1$-minimization [11].

Despite  the  considerable  attention  that  has  been  paid  to  both  OMP  and  the  RIP,  analysis 
of  OMP  using  the  RIP  has  been  relatively  elusive  to  date.  However,  several  alternative  greedy 
algorithms  have  been  proposed — all  essentially  modifications  of  OMP — that  are  apparently  much 
more  amenable  to  RIP-based  analysis.  The  Regularized  Orthogonal  Matching  Pursuit  (ROMP)  [13, 
14]  and  Subspace  Pursuit  (SP)  [16]  algorithms  differ  from  OMP  in  the  identification  step,  while  the 
Compressive  Sampling  Matching  Pursuit  (CoSaMP)  [15]  and  DThresh  [17]  algorithms  differ  from 
OMP  in  both  the  identification  and  the  update  steps.  For  each  of  these  algorithms  it  has  been 
shown that the RIP of order $CK$ (where $C \ge 2$ is a constant depending on the algorithm) with $\delta$ adequately small is sufficient for exact recovery of $K$-sparse signals.
1.3 Contributions


Our  contributions  in  this  paper  are  twofold.  First,  we  begin  in  Section  2  with  some  very  simple 
observations  regarding  OMP.  Many  of  these  facts  are  known  to  practitioners  in  the  field  but  may 
not  be  obvious  to  a  novice,  and  we  feel  that  such  readers  may  find  value  in  a  short  exposition. 

Critically, these observations also set the stage for our main results in Section 3, in which we demonstrate that the RIP can be used for a very straightforward analysis of OMP. Our analysis revolves around three key facts: (1) that in each step of the algorithm, the residual vector $r^\ell$ can be written as a matrix times a sparse signal, (2) that this matrix satisfies the RIP, and (3) that consequently a sharp bound can be established for the vector $h^\ell$ of inner products. Our main conclusion, Theorem 3.1, states that the RIP of order $K+1$ (with $\delta < \frac{1}{3\sqrt{K}}$) is sufficient for OMP to exactly recover any $K$-sparse signal in exactly $K$ iterations. However, for restricted classes of $K$-sparse signals (those with sufficiently strong decay in the nonzero coefficients), a relaxed bound on the isometry constant can be used. We discuss such extensions of our results in Section 4. A deeper understanding of OMP may also benefit the analysis of greedy algorithms in general. To demonstrate this, we briefly revisit the analysis of the ROMP algorithm in Section 4.

1.4  Context 

Let us place Theorem 3.1 in the context of the OMP literature. Using the RIP as a sufficient condition to guarantee OMP performance is apparently novel. Moreover, the fact that our bound requires only the RIP of order $K+1$ is apparently unique among the published CS literature; much more common are results requiring the RIP of order $1.75K$ [12], $2K$ [11,13], $3K$ [16,18], $4K$ [15], and so on. Of course, such results often permit the isometry constant to be much larger.¹

If one wishes to use the RIP of order $K+1$ as a sufficient condition for exact recovery of all $K$-sparse signals via OMP (as we have), then little improvement is possible in relaxing the isometry constant $\delta$ above $\frac{1}{3\sqrt{K}}$. In particular, there exists a matrix satisfying the RIP of order $K+1$ with $\delta \le \frac{1}{\sqrt{K}}$ for which there exists a $K$-sparse signal $x \in \mathbb{R}^N$ that cannot be recovered exactly via $K$ iterations of OMP. (This is conjectured in [16] with a suggestion for constructing such a matrix, and for the case $K = 2$ we have confirmed this via experimentation.)

Unfortunately, from (2) we see that finding a matrix $\Phi$ satisfying the RIP of order $K+1$ with an isometry constant $\delta < \frac{1}{3\sqrt{K}}$ may require $M = O(K^2 \log(N/K))$ random measurements. If one wishes to guarantee exact recovery of all $K$-sparse signals via OMP (as we have), then little improvement is possible in relaxing this number. In particular, it has been argued [20] that when $M \le K^{3/2}$, for most random $M \times N$ matrices $\Phi$ there will exist some $K$-sparse signal $x \in \mathbb{R}^N$ that cannot be recovered exactly via $K$ iterations of OMP.

It is also worth comparing our RIP-based analysis with coherence-based analysis [3], as both techniques provide a sufficient condition for OMP to recover all $K$-sparse signals. It has been shown [6] that in a random $M \times N$ matrix, the coherence parameter $\mu$ is unlikely to be smaller than $\log(N)/\sqrt{M}$. Thus, to ensure $\mu < \frac{1}{2K-1}$, one requires $M = O(K^2 \log^2(N))$, which is roughly the same as what is required by our analysis. We note that neither result is strictly stronger than the other; we have confirmed experimentally that there exist matrices that satisfy our RIP condition but not the coherence condition, and vice versa.

Finally, we note that the aforementioned modifications of OMP (the ROMP, SP, CoSaMP, and DThresh algorithms) all have RIP-based guarantees of robust recovery in noise and stable recovery of non-sparse signals. To date, no such RIP-based or coherence-based guarantees have been provided for OMP itself. We speculate that our perspective may help to further the understanding of OMP and perhaps provide a route to such a guarantee. At present, however, this remains a topic of ongoing work.

¹ Note that a smaller order of the RIP is not necessarily a weaker requirement if the required constant is also significantly smaller. For example, Corollary 3.4 of [15] implies that if $\Phi$ satisfies the RIP of order $K+1$ with constant $\delta$, then $\Phi$ also satisfies the RIP of order $2K$ with constant $4\delta$.

1.5  Notation 

Before proceeding, we set our notation. Suppose $\Lambda \subseteq \{1, 2, \ldots, N\}$. We let $\Lambda^c = \{1, 2, \ldots, N\} \setminus \Lambda$. By $x|_\Lambda$ we mean the length $|\Lambda|$ vector containing the entries of $x$ indexed by $\Lambda$.

By $\Phi_\Lambda$ we mean the $M \times |\Lambda|$ matrix obtained by selecting the columns of $\Phi$ indexed by $\Lambda$, and by $\mathcal{R}(\Phi_\Lambda)$ we mean the range, or column space, of $\Phi_\Lambda$. We will assume throughout that when $|\Lambda| \le M$, $\Phi_\Lambda$ is full rank, in which case we let $\Phi_\Lambda^\dagger := (\Phi_\Lambda^T \Phi_\Lambda)^{-1} \Phi_\Lambda^T$ denote the Moore-Penrose pseudoinverse of $\Phi_\Lambda$.

We denote the orthogonal projection operator onto $\mathcal{R}(\Phi_\Lambda)$ by $P_\Lambda := \Phi_\Lambda \Phi_\Lambda^\dagger$. Similarly, $P_\Lambda^\perp := (I - P_\Lambda)$ is the orthogonal projection operator onto the orthogonal complement of $\mathcal{R}(\Phi_\Lambda)$. We note that any orthogonal projection operator $P$ obeys $P = P^T = P^2$.

Finally, we define $A_\Lambda := P_\Lambda^\perp \Phi$. This matrix is the result of orthogonalizing the columns of $\Phi$ against $\mathcal{R}(\Phi_\Lambda)$. It is therefore equal to zero on columns indexed by $\Lambda$.
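The operators defined above are easy to form numerically for a small example. In the sketch below, the helper name `projectors`, the matrix sizes, and the index set are assumptions for illustration; `np.linalg.pinv` coincides with $(\Phi_\Lambda^T \Phi_\Lambda)^{-1}\Phi_\Lambda^T$ only when $\Phi_\Lambda$ has full column rank, as assumed in the text.

```python
import numpy as np

def projectors(Phi, Lam):
    """Return P_Lambda, P_Lambda^perp and A_Lambda = P_Lambda^perp Phi."""
    Phi_L = Phi[:, Lam]
    pinv = np.linalg.pinv(Phi_L)            # Moore-Penrose pseudoinverse of Phi_Lambda
    P = Phi_L @ pinv                        # projection onto the range of Phi_Lambda
    P_perp = np.eye(Phi.shape[0]) - P       # projection onto the orthogonal complement
    A = P_perp @ Phi                        # columns of Phi orthogonalized against the range
    return P, P_perp, A

rng = np.random.default_rng(1)
Phi = rng.standard_normal((6, 10))          # example sizes
Lam = [2, 7]                                # example index set (0-based)
P, P_perp, A = projectors(Phi, Lam)

print(np.allclose(P, P.T), np.allclose(P, P @ P))   # P = P^T = P^2
print(np.allclose(A[:, Lam], 0.0))                  # A_Lambda vanishes on columns in Lambda
```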

2  Observations 

Let us begin with some very simple observations regarding OMP as presented in Algorithm 1. The key idea is to try to iteratively estimate a set $\Lambda$ that contains the locations of the nonzeros of $x$ by starting with $\Lambda = \emptyset$ and then adding a new element to $\Lambda$ in each iteration. In order to select which element to add, the algorithm also maintains a residual vector $r \notin \mathcal{R}(\Phi_\Lambda)$ that represents the component of the measurement vector $y$ that cannot be explained by the columns of $\Phi_\Lambda$. Specifically, at the beginning of the $\ell$th iteration, $\Lambda^\ell$ is our current estimate of $\mathrm{supp}(x)$, and the residual $r^\ell$ is defined as $r^\ell = y - \Phi x^\ell$ where $\mathrm{supp}(x^\ell) \subseteq \Lambda^\ell$. The element added to $\Lambda^\ell$ is the index of the column of $\Phi$ that has the largest inner product (in magnitude) with $r^\ell$.

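Our first observation is that $r^\ell$ can be viewed as the orthogonalization of $y$ against the previously chosen columns of $\Phi$. To see this, note that the solution to the least squares problem in the update step is given by

$$x^\ell|_{\Lambda^\ell} = \Phi_{\Lambda^\ell}^\dagger y \quad \text{and} \quad x^\ell|_{(\Lambda^\ell)^c} = 0. \qquad (3)$$

Thus we observe that

$$r^\ell = y - \Phi x^\ell = y - \Phi_{\Lambda^\ell} \Phi_{\Lambda^\ell}^\dagger y = (I - P_{\Lambda^\ell}) y = P_{\Lambda^\ell}^\perp y.$$

Note that it is not actually necessary to explicitly compute $x^\ell$ in order to calculate $r^\ell$.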

Our second observation is that, in the matching step, one may correlate $r^\ell$ either with the columns of $\Phi$ or with the columns of $A_{\Lambda^\ell}$. To see this equivalence, observe that $r^\ell = P_{\Lambda^\ell}^\perp y = P_{\Lambda^\ell}^\perp P_{\Lambda^\ell}^\perp y = (P_{\Lambda^\ell}^\perp)^T P_{\Lambda^\ell}^\perp y$ and so

$$h^\ell = \Phi^T r^\ell = \Phi^T (P_{\Lambda^\ell}^\perp)^T P_{\Lambda^\ell}^\perp y = A_{\Lambda^\ell}^T r^\ell. \qquad (4)$$

Incidentally, along these same lines we observe that

$$h^\ell = \Phi^T r^\ell = \Phi^T P_{\Lambda^\ell}^\perp y = \Phi^T (P_{\Lambda^\ell}^\perp)^T y = A_{\Lambda^\ell}^T y.$$

From this we note that it is not actually necessary to explicitly compute $r^\ell$ in order to calculate the inner products during the matching step; in fact, the original formulation of OMP was stated with instructions to orthogonalize the remaining columns of $\Phi$ against those previously chosen and merely correlate the resulting vectors against $y$ [1,2]. Additionally, we recall that, in $A_{\Lambda^\ell}$, all columns indexed by $\Lambda^\ell$ will be zero. It follows that

$$h^\ell(j) = 0 \quad \forall j \in \Lambda^\ell, \qquad (5)$$

and so, since $\Lambda^\ell = \Lambda^{\ell-1} \cup \{j^*\}$ with $j^* \notin \Lambda^{\ell-1}$,

$$|\Lambda^\ell| = \ell. \qquad (6)$$

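Our third observation is that, in the case of noise-free measurements $y = \Phi x$, we may write

$$r^\ell = P_{\Lambda^\ell}^\perp y = P_{\Lambda^\ell}^\perp \Phi x = A_{\Lambda^\ell} x.$$

Again recalling that all columns of $A_{\Lambda^\ell}$ indexed by $\Lambda^\ell$ are zero, we thus note that when $\mathrm{supp}(x) \subseteq \Lambda^\ell$, $r^\ell = 0$, and from (3) we also know that $x^\ell = x$ exactly. It will also be useful to note that for the same reason, we can also write

$$r^\ell = A_{\Lambda^\ell} \tilde{x}^\ell, \qquad (7)$$

where

$$\tilde{x}^\ell|_{\Lambda^\ell} = 0 \quad \text{and} \quad \tilde{x}^\ell|_{(\Lambda^\ell)^c} = x|_{(\Lambda^\ell)^c}. \qquad (8)$$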
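The three observations of this section are exact algebraic identities, so they can be checked numerically on a small example. The sketch below is illustrative only: the sizes, the sparse signal, and the index set `Lam` (standing in for $\Lambda^\ell$ after two hypothetical iterations) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 20, 40
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x = np.zeros(N)
x[[3, 11, 25]] = [1.5, -2.0, 0.7]
y = Phi @ x                                   # noise-free measurements
Lam = [11, 30]                                # hypothetical Lambda^l (0-based indices)

Phi_L = Phi[:, Lam]
P_perp = np.eye(M) - Phi_L @ np.linalg.pinv(Phi_L)
A = P_perp @ Phi                              # A_{Lambda^l}

# First observation: the least-squares residual equals P_perp y.
coef, *_ = np.linalg.lstsq(Phi_L, y, rcond=None)
r = y - Phi_L @ coef
obs1 = np.allclose(r, P_perp @ y)

# Second observation: h = Phi^T r = A^T r = A^T y, and h vanishes on Lambda^l (eqs. (4), (5)).
h = Phi.T @ r
obs2 = np.allclose(h, A.T @ r) and np.allclose(h, A.T @ y) and np.allclose(h[Lam], 0.0)

# Third observation: r = A x = A x_tilde, with x_tilde as in (8).
x_tilde = x.copy()
x_tilde[Lam] = 0.0
obs3 = np.allclose(r, A @ x) and np.allclose(r, A @ x_tilde)

print(obs1, obs2, obs3)
```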

3  Analysis 

Our analysis of OMP will center on the vector $h^\ell$. In light of (4) and (7), we see that $A_{\Lambda^\ell}$ plays a role both in constructing and in analyzing the residual vector. In Lemma 3.2 below, we show that the matrix $A_{\Lambda^\ell}$ satisfies a modified version of the RIP. This allows us to very precisely bound the values of the inner products in the vector $h^\ell$.

We  begin  with  two  elementary  lemmas  whose  proofs  are  given  in  the  Appendix.  Our  first 
result,  which  is  a  straightforward  generalization  of  Lemma  2.1  of  [11],  states  that  RIP  operators 
approximately  preserve  inner  products  between  sparse  vectors. 

Lemma 3.1 Let $u, v \in \mathbb{R}^N$ be given, and suppose that a matrix $\Psi$ satisfies the RIP of order $\max(\|u+v\|_0, \|u-v\|_0)$ with isometry constant $\delta$. Then

$$|\langle \Psi u, \Psi v \rangle - \langle u, v \rangle| \le \delta \|u\|_2 \|v\|_2. \qquad (9)$$

One consequence of this result is that sparse vectors that are orthogonal in $\mathbb{R}^N$ remain nearly orthogonal after the application of $\Psi$. From this observation, it was demonstrated independently in [21] and [16] that if $\Phi$ has the RIP, then $A_\Lambda$ satisfies a modified version of the RIP.

Lemma 3.2 Suppose that $\Phi$ satisfies the RIP of order $K$ with isometry constant $\delta$, and let $\Lambda \subseteq \{1, 2, \ldots, N\}$. If $|\Lambda| < K$ then

$$\left(1 - \frac{\delta}{1-\delta}\right) \|u\|_2^2 \le \|A_\Lambda u\|_2^2 \le (1+\delta) \|u\|_2^2 \qquad (10)$$

for all $u \in \mathbb{R}^N$ such that $\|u\|_0 \le K - |\Lambda|$ and $\mathrm{supp}(u) \cap \Lambda = \emptyset$.

In other words, if $\Phi$ satisfies the RIP of order $K$, then $A_\Lambda$ acts as an approximate isometry on every $(K - |\Lambda|)$-sparse vector supported on $\Lambda^c$. From (7), we recall that the residual vector in OMP is formed by applying $A_{\Lambda^\ell}$ to a sparse vector supported on $(\Lambda^\ell)^c$. Combining the above results, then, we may bound the inner products $h^\ell(j)$ as follows.

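Lemma 3.3 Let $\Lambda \subseteq \{1, 2, \ldots, N\}$ and suppose $\tilde{x} \in \mathbb{R}^N$ with $\mathrm{supp}(\tilde{x}) \cap \Lambda = \emptyset$. Define

$$h = A_\Lambda^T A_\Lambda \tilde{x}. \qquad (11)$$

Then if $\Phi$ satisfies the RIP of order $\|\tilde{x}\|_0 + |\Lambda| + 1$ with isometry constant $\delta$, we have

$$|h(j) - \tilde{x}(j)| \le \frac{\delta}{1-\delta} \|\tilde{x}\|_2 \qquad (12)$$

for all $j \notin \Lambda$.

Proof: From Lemma 3.2 we have that the restriction of $A_\Lambda$ to the columns indexed by $\Lambda^c$ satisfies the RIP of order $(\|\tilde{x}\|_0 + |\Lambda| + 1) - |\Lambda| = \|\tilde{x}\|_0 + 1$ with isometry constant $\delta/(1-\delta)$. By the definition of $h$, we also know that

$$h(j) = \langle A_\Lambda \tilde{x}, A_\Lambda e_j \rangle,$$

where $e_j$ denotes the $j$th vector from the cardinal basis. Now, suppose $j \notin \Lambda$. Then because $\|\tilde{x} \pm e_j\|_0 \le \|\tilde{x}\|_0 + 1$ and $\mathrm{supp}(\tilde{x} \pm e_j) \cap \Lambda = \emptyset$, we conclude from Lemma 3.1 that

$$|h(j) - \tilde{x}(j)| = |\langle A_\Lambda \tilde{x}, A_\Lambda e_j \rangle - \langle \tilde{x}, e_j \rangle| \le \frac{\delta}{1-\delta} \|\tilde{x}\|_2 \|e_j\|_2.$$

Noting that $\|e_j\|_2 = 1$, we reach the desired conclusion. □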
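The bound (12) can be sanity-checked numerically on a problem small enough that the isometry constant is computable by exhaustive search. Everything in the sketch below is an assumption made for illustration: the helper `rip_constant`, the index set, the test vector, and in particular the tall matrix ($M > N$), which is used only so that the brute-force $\delta$ comes out well below 1.

```python
import itertools
import numpy as np

def rip_constant(Phi, order):
    """Brute-force isometry constant over all supports of the given size (hypothetical helper)."""
    delta = 0.0
    for S in itertools.combinations(range(Phi.shape[1]), order):
        cols = Phi[:, list(S)]
        eig = np.linalg.eigvalsh(cols.T @ cols)
        delta = max(delta, 1.0 - eig[0], eig[-1] - 1.0)
    return delta

rng = np.random.default_rng(3)
M, N = 400, 10                                   # tall on purpose, see the note above
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
Lam = [0]                                        # example Lambda
x_tilde = np.zeros(N)
x_tilde[[4, 7]] = [1.0, -0.5]                    # supported away from Lambda

order = np.count_nonzero(x_tilde) + len(Lam) + 1  # ||x~||_0 + |Lambda| + 1
delta = rip_constant(Phi, order)

P_perp = np.eye(M) - Phi[:, Lam] @ np.linalg.pinv(Phi[:, Lam])
A = P_perp @ Phi
h = A.T @ (A @ x_tilde)                          # definition (11)

off = [j for j in range(N) if j not in Lam]
gap = np.max(np.abs(h[off] - x_tilde[off]))      # left-hand side of (12), worst case over j
bound = delta / (1.0 - delta) * np.linalg.norm(x_tilde)
print(delta, gap, bound)
```

In such experiments the observed gap is typically far below the bound, since (12) is a worst-case statement over all admissible vectors and supports.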

With this bound on the inner products $h^\ell(j)$, we may derive a sufficient condition under which the identification step of OMP will succeed.

Corollary 3.1 Suppose that $\Lambda$, $\Phi$, $\tilde{x}$ meet the assumptions specified in Lemma 3.3, and let $h$ be as defined in (11). If

$$\|\tilde{x}\|_\infty > \frac{2\delta}{1-\delta} \|\tilde{x}\|_2, \qquad (13)$$

we are guaranteed that $\arg\max_j |h(j)| \in \mathrm{supp}(\tilde{x})$.


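Proof: If (12) is satisfied, then for indices $j \notin \mathrm{supp}(\tilde{x})$, we will have $|h(j)| \le \frac{\delta}{1-\delta} \|\tilde{x}\|_2$. (Recall from (5) that $h(j) = 0$ for $j \in \Lambda$.) If (13) is satisfied, then there exists some $j \in \mathrm{supp}(\tilde{x})$ with $|\tilde{x}(j)| > \frac{2\delta}{1-\delta} \|\tilde{x}\|_2$. From (12) and the triangle inequality, we conclude that for this index $j$, $|h(j)| > \frac{\delta}{1-\delta} \|\tilde{x}\|_2$. □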
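Written out, the triangle-inequality step of this proof reads as follows. The abbreviation $c$ is introduced here only for brevity and is not notation from the paper.

```latex
\begin{align*}
c &:= \frac{\delta}{1-\delta}\,\|\tilde{x}\|_2, \\
|h(j)| &\ge |\tilde{x}(j)| - |h(j) - \tilde{x}(j)| > 2c - c = c
  && \text{for some } j \in \mathrm{supp}(\tilde{x}), \\
|h(j')| &= |h(j') - \tilde{x}(j')| \le c
  && \text{for every } j' \notin \mathrm{supp}(\tilde{x}) \cup \Lambda, \\
|h(j')| &= 0
  && \text{for every } j' \in \Lambda .
\end{align*}
```

Hence the largest entry of $|h|$ must be attained at an index in $\mathrm{supp}(\tilde{x})$.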

By choosing $\delta$ small enough, it is possible to guarantee that the condition (13) is satisfied. In particular, the lemma below follows from standard arguments.


Lemma 3.4 For any $u \in \mathbb{R}^N$, $\|u\|_\infty \ge \|u\|_2 / \sqrt{\|u\|_0}$.
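One way to see Lemma 3.4 (a standard argument, sketched here for a nonzero $u$) is to bound each nonzero entry by the largest one:

```latex
\|u\|_2^2 = \sum_{j \in \mathrm{supp}(u)} |u(j)|^2
  \le \|u\|_0 \, \|u\|_\infty^2
\quad\Longrightarrow\quad
\|u\|_\infty \ge \frac{\|u\|_2}{\sqrt{\|u\|_0}} .
```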


Putting  these  results  together,  we  can  now  establish  our  main  theorem  concerning  OMP. 

Theorem 3.1 Suppose that $\Phi$ satisfies the RIP of order $K+1$ with isometry constant $\delta < \frac{1}{3\sqrt{K}}$. Then for any $x \in \mathbb{R}^N$ with $\|x\|_0 \le K$, OMP will recover $x$ exactly from $y = \Phi x$ in $K$ iterations.

Proof: The proof works by induction. We start with the first iteration where $h^0 = \Phi^T \Phi x$ and note that $\Phi = A_\emptyset$. Because $\|x\|_0 \le K$, Lemma 3.4 states that $\|x\|_\infty \ge \frac{\|x\|_2}{\sqrt{K}}$. One can also check that $\delta < \frac{1}{3\sqrt{K}}$ implies that $\frac{2\delta}{1-\delta} < \frac{1}{\sqrt{K}}$. Therefore, we are guaranteed that (13) is satisfied, and so from Corollary 3.1 we conclude that $\arg\max_j |h^0(j)| \in \mathrm{supp}(x)$.
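The implication used in this step can be verified directly; the short derivation below is a sketch for $K \ge 1$ and $0 < \delta < 1$.

```latex
\begin{align*}
\frac{2\delta}{1-\delta} < \frac{1}{\sqrt{K}}
  &\iff 2\sqrt{K}\,\delta < 1 - \delta
   \iff \delta < \frac{1}{2\sqrt{K}+1}, \\
\delta < \frac{1}{3\sqrt{K}} &\le \frac{1}{2\sqrt{K}+1}
  \quad \text{since } \sqrt{K} \ge 1 .
\end{align*}
```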

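We now consider the general induction step. Suppose that we are at iteration $\ell$ and that all previous iterations have succeeded, by which we mean that $\Lambda^\ell \subseteq \mathrm{supp}(x)$. From (8), we know that $\mathrm{supp}(\tilde{x}^\ell) \cap \Lambda^\ell = \emptyset$ and that $\|\tilde{x}^\ell\|_0 \le K - \ell$. From (6), we know that $|\Lambda^\ell| = \ell$. By assumption, $\Phi$ satisfies the RIP of order $K + 1 = (K - \ell) + \ell + 1 \ge \|\tilde{x}^\ell\|_0 + |\Lambda^\ell| + 1$. Finally, using Lemma 3.4, we have that

$$\|\tilde{x}^\ell\|_\infty \ge \frac{\|\tilde{x}^\ell\|_2}{\sqrt{K - \ell}} \ge \frac{\|\tilde{x}^\ell\|_2}{\sqrt{K}} > \frac{2\delta}{1-\delta} \|\tilde{x}^\ell\|_2.$$

From Corollary 3.1 we conclude that $\arg\max_j |h^\ell(j)| \in \mathrm{supp}(\tilde{x}^\ell)$ and hence $\Lambda^{\ell+1} \subseteq \mathrm{supp}(x)$. □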


4  Extensions 


4.1  Strongly-decaying  sparse  signals 

For even moderate values of the isometry constant $\delta$ there exist sparse signals that we can ensure are recovered exactly. For example, if the decay of coefficients is sufficiently strong in a sparse signal, we may use Lemma 3.3 to ensure that the signal entries are recovered in the order of their magnitude.

For any $x \in \mathbb{R}^N$ with $\|x\|_0 \le K$ we denote by $x'(j)$ the entries of $x$ ordered by magnitude, i.e.,

$$|x'(1)| \ge |x'(2)| \ge \cdots \ge |x'(K)| \ge 0$$

with $x'(K+1) = x'(K+2) = \cdots = x'(N) = 0$.


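Theorem 4.1 Suppose that $\Phi$ satisfies the RIP of order $K+1$ with isometry constant $\delta < \frac{1}{3}$. Suppose $x \in \mathbb{R}^N$ with $\|x\|_0 \le K$ and that for all $j \in \{1, 2, \ldots, K-1\}$,

$$\frac{|x'(j)|}{|x'(j+1)|} \ge \alpha.$$

If

$$\alpha > \frac{1 + 2\frac{\delta}{1-\delta}\sqrt{K-1}}{1 - 2\frac{\delta}{1-\delta}}, \qquad (14)$$

then OMP will recover $x$ exactly from $y = \Phi x$ in $K$ iterations.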


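Proof: The proof again proceeds by induction. At each stage, OMP will choose the largest entry of $\tilde{x}^\ell$. To see this, note that by (12) we have $|h^\ell(j) - \tilde{x}^\ell(j)| \le \frac{\delta}{1-\delta} \|\tilde{x}^\ell\|_2$. The nonzero entries of $\tilde{x}^\ell$ will be comprised of $x'(\ell+1), x'(\ell+2), \ldots, x'(K)$. Thus,

$$\|\tilde{x}^\ell\|_2 \le \sqrt{|x'(\ell+1)|^2 + (K-1)\frac{|x'(\ell+1)|^2}{\alpha^2}} = \frac{|x'(\ell+1)|}{\alpha}\sqrt{\alpha^2 + (K-1)} \le \frac{|x'(\ell+1)|}{\alpha}\left(\alpha + \sqrt{K-1}\right).$$

Now, for the largest entry of $\tilde{x}^\ell$, we have

$$|h^\ell(j)| \ge |x'(\ell+1)| - \frac{\delta}{1-\delta}\frac{|x'(\ell+1)|}{\alpha}\left(\alpha + \sqrt{K-1}\right) = \frac{|x'(\ell+1)|}{\alpha}\left(\alpha - \frac{\delta}{1-\delta}\left(\alpha + \sqrt{K-1}\right)\right), \qquad (15)$$

while for all other entries we have

$$|h^\ell(j)| \le |x'(\ell+2)| + \frac{\delta}{1-\delta}\frac{|x'(\ell+1)|}{\alpha}\left(\alpha + \sqrt{K-1}\right) \le \frac{|x'(\ell+1)|}{\alpha}\left(1 + \frac{\delta}{1-\delta}\left(\alpha + \sqrt{K-1}\right)\right). \qquad (16)$$

From (14), it follows that (15) is greater than (16). □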
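To get a feel for condition (14), the right-hand side can be evaluated for a few values of $\delta$ and $K$. The function name `decay_threshold` is hypothetical, and the sample values are chosen here purely for illustration.

```python
import math

def decay_threshold(delta, K):
    """Right-hand side of (14): the decay ratio alpha must strictly exceed this value."""
    if not (0.0 <= delta < 1.0 / 3.0):
        raise ValueError("condition (14) needs delta < 1/3 so that the denominator is positive")
    c = delta / (1.0 - delta)
    return (1.0 + 2.0 * c * math.sqrt(K - 1)) / (1.0 - 2.0 * c)

# Example: delta = 0.2, K = 5 gives c = 0.25, numerator 1 + 2*0.25*2 = 2, denominator 0.5.
alpha_min = decay_threshold(0.2, 5)
print(alpha_min)

# The threshold grows with K and blows up as delta approaches 1/3.
for K in (2, 10, 100):
    print(K, decay_threshold(0.1, K), decay_threshold(0.3, K))
```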

4.2  Analysis  of  other  orthogonal  greedy  algorithms 

We now demonstrate that the techniques used above can also be used to analyze other orthogonal greedy algorithms. We focus on ROMP [13,14] for the purpose of illustration, but similar methods should be able to simplify the analysis of other orthogonal greedy algorithms such as SP [16].²

We first briefly describe the difference between ROMP and OMP, which lies only in the identification step: whereas OMP adds only one index to $\Lambda^\ell$ at each iteration, ROMP adds up to $K$ indices to $\Lambda^\ell$ at each iteration. Specifically, ROMP first selects the indices corresponding to the $K$ largest elements in magnitude of $h^\ell$ (or all nonzero elements of $h^\ell$ if $h^\ell$ has fewer than $K$ nonzeros), and denotes this set as $\Omega^\ell$. The next step is to regularize this set so that the values are comparable in magnitude. To do this, define $R(\Omega^\ell) := \{\Omega \subseteq \Omega^\ell : |h^\ell(i)| \le 2|h^\ell(j)| \ \forall i, j \in \Omega\}$, and set

$$\Omega_0^\ell := \arg\max_{\Omega \in R(\Omega^\ell)} \|h^\ell|_\Omega\|_2,$$

i.e., $\Omega_0^\ell$ is the set with maximal energy among all regularized subsets of $\Omega^\ell$. Finally, setting $\Lambda^{\ell+1} = \Lambda^\ell \cup \Omega_0^\ell$, the remainder of the ROMP algorithm is identical to OMP.
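The ROMP identification step described above can be sketched as follows. This is one possible implementation, not necessarily the one used in [13]: it relies on the observation that, once the candidate magnitudes are sorted, a maximal-energy regularized subset can be taken to consist of all candidates whose magnitude lies within a factor of 2 of its largest element. The function name `romp_identify` is hypothetical.

```python
import numpy as np

def romp_identify(h, K):
    """Return a maximal-energy subset of the K largest entries of |h| whose
    magnitudes are pairwise within a factor of 2 (sorted 0-based indices)."""
    mags = np.abs(h)
    order = np.argsort(-mags)[:K]          # candidates: the K largest magnitudes
    order = order[mags[order] > 0]         # keep nonzero entries only
    best, best_energy = order[:0], -1.0
    for i in range(len(order)):
        top = mags[order[i]]               # largest magnitude in this candidate window
        tail = order[i:]
        window = tail[mags[tail] >= top / 2.0]   # comparable entries: top <= 2 * |h(j)|
        energy = float(np.sum(mags[window] ** 2))
        if energy > best_energy:
            best, best_energy = window, energy
    return np.sort(best)

h = np.array([10.0, 6.0, 5.0, 1.0, 0.9, 0.0])
omega_0 = romp_identify(h, K=5)
print(omega_0)
```

In this example the candidates with magnitudes 10, 6 and 5 are mutually comparable and carry more energy than any other comparable group, so they form the selected set.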

In  order  to  analyze  ROMP,  we  will  need  only  two  preliminary  lemmas  from  [13],  which  we  state 
without  proof.  Note  that  Lemma  4.1,  which  is  essentially  a  generalization  of  Lemma  3.3,  is  stated 
using  slightly  weaker  assumptions  than  those  stated  in  [13].  The  present  version  can  easily  be 
obtained  using  the  same  proof. 

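Lemma 4.1 ((1) in Prop. 3.2 of [13]) Let $\Gamma \subseteq \{1, 2, \ldots, N\}$ and $x \in \mathbb{R}^N$ be given. Then if $\Phi$ satisfies the RIP of order $|\mathrm{supp}(x) \cup \Gamma|$ with isometry constant $\delta$, we have

$$\|(\Phi^T \Phi x)|_\Gamma - x|_\Gamma\|_2 \le \delta \|x\|_2.$$

Lemma 4.2 (Lemma 3.7 of [13]) Let $u \in \mathbb{R}^K$, $K > 1$, be arbitrary. Then there exists a subset $\Gamma \subseteq \{1, \ldots, K\}$ such that $|u(i)| \le 2|u(j)|$ for all $i, j \in \Gamma$ and

$$\|u|_\Gamma\|_2 \ge \frac{1}{2.5\sqrt{\log_2 K}} \|u\|_2.$$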

Using these lemmas, we now provide a simplified proof of the main result of [13] concerning the recovery of sparse signals using ROMP.³

Theorem 4.2 Suppose that $\Phi$ satisfies the RIP of order $3K$ with isometry constant $\delta < 0.13/\sqrt{\log_2 K}$. Then for any $x \in \mathbb{R}^N$ with $\|x\|_0 \le K$, ROMP will recover $x$ exactly from $y = \Phi x$ in at most $K$ iterations.

² Some of the greedy algorithms that have been proposed recently, such as CoSaMP [15] and DThresh [17], do not orthogonalize the residual against the previously chosen columns at each iteration, and so the techniques above cannot be directly applied to these algorithms. However, this orthogonalization step could easily be added (which in the case of CoSaMP yields an algorithm nearly identical to SP). Orthogonalized versions of these algorithms could then be studied using these techniques.

³ Note that we assume that $\Phi$ satisfies the RIP of order $3K$ with constant $\delta < 0.13/\sqrt{\log_2 K}$. Using Corollary 3.4 of [15], we can replace this with the assumption that $\Phi$ satisfies the RIP of order $2K$ with constant $\delta < 0.043/\sqrt{\log_2 K}$.


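Proof: The proof works by showing that at each iteration,

$$|\Omega_0^\ell \cap \mathrm{supp}(x)| \ge \frac{1}{2} |\Omega_0^\ell|. \qquad (17)$$

If (17) is satisfied for $0, 1, \ldots, \ell - 1$, then at iteration $\ell$ we have that

$$|\Lambda^\ell \cap \mathrm{supp}(x)| \ge \frac{1}{2} |\Lambda^\ell|. \qquad (18)$$

It follows that, before $|\Lambda^\ell|$ exceeds $2K$, we will have $\mathrm{supp}(x) \subseteq \Lambda^\ell$. Because $\Phi$ satisfies the RIP of order $3K > 2K$, at termination, $\Phi_{\Lambda^\ell}$ will be full rank. From (3) we conclude that $x^\ell = x$ exactly.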

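To prove (17), we again proceed by induction. Hence, we assume that (17) holds for $0, 1, \ldots, \ell - 1$, and thus (18) holds for iteration $\ell$. We next assume for the sake of a contradiction that (17) does not hold for iteration $\ell$, i.e., that

$$|\Omega_0^\ell \setminus \mathrm{supp}(x)| > \frac{1}{2} |\Omega_0^\ell|. \qquad (19)$$

Define the sets $T = \Omega_0^\ell \setminus \mathrm{supp}(x)$ and $S = \mathrm{supp}(x) \setminus \Lambda^\ell = \mathrm{supp}(\tilde{x}^\ell)$, where $\tilde{x}^\ell$ is defined as in (8). Recall that we can write $h^\ell = A_{\Lambda^\ell}^T A_{\Lambda^\ell} \tilde{x}^\ell$. Thus, using the assumption that $|T| > \frac{1}{2}|\Omega_0^\ell|$ and the facts that $T \subseteq \Omega_0^\ell$ and $\Omega_0^\ell \in R(\Omega^\ell)$, one can show that

$$\|h^\ell|_T\|_2 \ge \frac{1}{\sqrt{5}} \|h^\ell|_{\Omega_0^\ell}\|_2. \qquad (20)$$

We now observe that

$$\|h^\ell|_{\Omega_0^\ell}\|_2 \ge \frac{1}{2.5\sqrt{\log_2 K}} \|h^\ell|_{\Omega^\ell}\|_2, \qquad (21)$$

which follows from Lemma 4.2 and the fact that $\Omega_0^\ell$ is the maximal regularizing set. From the maximality of $\Omega^\ell$ and the fact that $|S| \le K$, we have that $\|h^\ell|_{\Omega^\ell}\|_2 \ge \|h^\ell|_S\|_2$, so that by combining (20) and (21) we obtain

$$\|h^\ell|_T\|_2 \ge \frac{1}{2.5\sqrt{5 \log_2 K}} \|h^\ell|_S\|_2. \qquad (22)$$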

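Note that $|S \cup \mathrm{supp}(\tilde{x}^\ell)| = |S| \le K$ and since $|\Lambda^\ell| \le 2K$, from Lemma 3.2 we have that $A_{\Lambda^\ell}$ satisfies the RIP of order at least $K$ with constant $\delta/(1-\delta)$, thus Lemma 4.1 implies that

$$\|h^\ell|_S - \tilde{x}^\ell|_S\|_2 \le \frac{\delta}{1-\delta} \|\tilde{x}^\ell\|_2.$$

Since $\tilde{x}^\ell$ is supported on $S$, we have $\|h^\ell|_S - \tilde{x}^\ell|_S\|_2 \ge \|\tilde{x}^\ell\|_2 - \|h^\ell|_S\|_2$, and thus

$$\|h^\ell|_S\|_2 \ge \frac{1 - 2\delta}{1 - \delta} \|\tilde{x}^\ell\|_2.$$

Hence,

$$\|h^\ell|_T\|_2 \ge \frac{(1 - 2\delta)/(1 - \delta)}{2.5\sqrt{5 \log_2 K}} \|\tilde{x}^\ell\|_2. \qquad (23)$$

On the other hand, since $|\mathrm{supp}(\tilde{x}^\ell)| + |\Lambda^\ell \cap \mathrm{supp}(x)| = K$, from (18) we obtain that $|\mathrm{supp}(\tilde{x}^\ell)| \le K - \frac{1}{2}|\Lambda^\ell|$. Thus, $|T \cup \mathrm{supp}(\tilde{x}^\ell)| \le |T| + |\mathrm{supp}(\tilde{x}^\ell)| \le 2K - \frac{1}{2}|\Lambda^\ell|$. Furthermore, $A_{\Lambda^\ell}$ satisfies the RIP of order $3K - |\Lambda^\ell| = 3K - \frac{1}{2}|\Lambda^\ell| - \frac{1}{2}|\Lambda^\ell|$. Since $|\Lambda^\ell| \le 2K$, we have that $A_{\Lambda^\ell}$ satisfies the RIP of order at least $2K - \frac{1}{2}|\Lambda^\ell|$ with constant $\delta/(1-\delta)$. Thus, Lemma 4.1 also implies that

$$\|h^\ell|_T\|_2 = \|h^\ell|_T - \tilde{x}^\ell|_T\|_2 \le \frac{\delta}{1-\delta} \|\tilde{x}^\ell\|_2. \qquad (24)$$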
This is a contradiction whenever the right-hand side of (23) is greater than the right-hand side of (24), which occurs when $\delta < 1/(2 + 2.5\sqrt{5 \log_2 K})$. Since $\log_2 K \ge 1$, we can replace this with the slightly stricter condition $\delta < 1/((2 + 2.5\sqrt{5})\sqrt{\log_2 K}) \approx 0.1317/\sqrt{\log_2 K}$. □
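The constant in the final condition, and the claim that the simplified condition is the stricter of the two, can be checked with a few lines of arithmetic. The names `C`, `exact_threshold` and `simplified_threshold` are introduced here for illustration, and the range of $K$ tested is arbitrary.

```python
import math

C = 1.0 / (2.0 + 2.5 * math.sqrt(5.0))      # constant in the simplified condition

def exact_threshold(K):
    """Bound on delta obtained by comparing (23) with (24)."""
    return 1.0 / (2.0 + 2.5 * math.sqrt(5.0 * math.log2(K)))

def simplified_threshold(K):
    """Stricter bound with the sqrt(log2 K) factor pulled out (valid for K >= 2)."""
    return C / math.sqrt(math.log2(K))

# The simplified threshold never exceeds the exact one for K >= 2
# (a tiny tolerance guards against floating-point rounding at K = 2, where they coincide).
checks = [simplified_threshold(K) <= exact_threshold(K) + 1e-15 for K in range(2, 1025)]
print(C, all(checks))
```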

Observe that when $K = 1$, this proof (as well as the proofs in [13,14]) breaks down since Lemma 4.2 does not apply. However, when $K = 1$ the ROMP algorithm simply reduces to OMP. In this case we can apply Theorem 3.1 to verify that ROMP succeeds when $K = 1$ provided that $\Phi$ satisfies the RIP of order 2 with isometry constant $\delta < 1/3$.


Appendix 


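Proof of Lemma 3.1: We first assume that $\|u\|_2 = \|v\|_2 = 1$. From the fact that

$$\|u \pm v\|_2^2 = \|u\|_2^2 + \|v\|_2^2 \pm 2\langle u, v \rangle = 2 \pm 2\langle u, v \rangle$$

and since $\Psi$ satisfies the RIP, we have that

$$(1 - \delta)(2 \pm 2\langle u, v \rangle) \le \|\Psi u \pm \Psi v\|_2^2 \le (1 + \delta)(2 \pm 2\langle u, v \rangle).$$

From the parallelogram identity we obtain

$$\langle \Psi u, \Psi v \rangle = \frac{1}{4}\left(\|\Psi u + \Psi v\|_2^2 - \|\Psi u - \Psi v\|_2^2\right) \le \frac{(1 + \langle u, v \rangle)(1 + \delta) - (1 - \langle u, v \rangle)(1 - \delta)}{2} = \langle u, v \rangle + \delta.$$

Similarly, one can show that $\langle \Psi u, \Psi v \rangle \ge \langle u, v \rangle - \delta$, and thus $|\langle \Psi u, \Psi v \rangle - \langle u, v \rangle| \le \delta$. The result follows for $u, v$ with arbitrary norm from the bilinearity of the inner product. □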


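Proof of Lemma 3.2: From the definition of $A_\Lambda$ we may decompose $A_\Lambda u$ as $A_\Lambda u = \Phi u - P_\Lambda \Phi u$. Since $P_\Lambda$ is an orthogonal projection, we can write

$$\|\Phi u\|_2^2 = \|P_\Lambda \Phi u\|_2^2 + \|A_\Lambda u\|_2^2. \qquad (25)$$

Our goal is to show that $\|\Phi u\|_2 \approx \|A_\Lambda u\|_2$, or equivalently, that $\|P_\Lambda \Phi u\|_2$ is small. Towards this end, we note that since $P_\Lambda \Phi u$ is orthogonal to $A_\Lambda u$,

$$\langle P_\Lambda \Phi u, \Phi u \rangle = \langle P_\Lambda \Phi u, P_\Lambda \Phi u + A_\Lambda u \rangle = \langle P_\Lambda \Phi u, P_\Lambda \Phi u \rangle + \langle P_\Lambda \Phi u, A_\Lambda u \rangle = \|P_\Lambda \Phi u\|_2^2. \qquad (26)$$

Since $P_\Lambda$ is a projection onto $\mathcal{R}(\Phi_\Lambda)$ there exists a $z \in \mathbb{R}^N$ with $\mathrm{supp}(z) \subseteq \Lambda$ such that $P_\Lambda \Phi u = \Phi z$. Furthermore, by assumption, $\mathrm{supp}(u) \cap \Lambda = \emptyset$. Hence $\langle u, z \rangle = 0$ and from the RIP and Lemma 3.1,

$$\frac{|\langle P_\Lambda \Phi u, \Phi u \rangle|}{\|P_\Lambda \Phi u\|_2 \|\Phi u\|_2} = \frac{|\langle \Phi z, \Phi u \rangle|}{\|\Phi z\|_2 \|\Phi u\|_2} \le \frac{|\langle \Phi z, \Phi u \rangle|}{(1 - \delta)\|z\|_2 \|u\|_2} \le \frac{\delta}{1 - \delta}.$$

Combining this with (26), we obtain

$$\|P_\Lambda \Phi u\|_2 \le \frac{\delta}{1 - \delta} \|\Phi u\|_2.$$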

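Since we trivially have that $\|P_\Lambda \Phi u\|_2 \ge 0$, we can combine this with (25) to obtain

$$\left(1 - \left(\frac{\delta}{1 - \delta}\right)^2\right) \|\Phi u\|_2^2 \le \|A_\Lambda u\|_2^2 \le \|\Phi u\|_2^2.$$

Since $\|u\|_0 \le K$, we can use the RIP to obtain

$$\left(1 - \left(\frac{\delta}{1 - \delta}\right)^2\right)(1 - \delta) \|u\|_2^2 \le \|A_\Lambda u\|_2^2 \le (1 + \delta) \|u\|_2^2,$$

which simplifies to (10). □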
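The simplification of the lower constant to the form appearing in (10) can be carried out explicitly:

```latex
\left(1 - \frac{\delta^2}{(1-\delta)^2}\right)(1-\delta)
 = \frac{(1-\delta)^2 - \delta^2}{1-\delta}
 = \frac{1 - 2\delta}{1-\delta}
 = 1 - \frac{\delta}{1-\delta} .
```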